AITopics | localization performance

localization performance

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization

Neural Information Processing Systems | Feb-16-2026, 01:10:09 GMT

Visual and auditory perception is crucial for observing the world. When we hear a sound, our brain will extract semantic information and locate the sounding source.

artificial intelligence, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Shandong Province (0.04)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.34)

From Passive Perception to Active Memory: A Weakly Supervised Image Manipulation Localization Framework Driven by Coarse-Grained Annotations

Guo, Zhiqing, Xi, Dongdong, Li, Songlin, Yang, Gaobo

arXiv.org Artificial Intelligence | Nov-26-2025

Image manipulation localization (IML) faces a fundamental trade-off between minimizing annotation cost and achieving fine-grained localization accuracy. Existing fully-supervised IML methods depend heavily on dense pixel-level mask annotations, which limits scalability to large datasets or real-world deployment. In contrast, the majority of existing weakly-supervised IML approaches are based on image-level labels, which greatly reduce annotation effort but typically lack precise spatial localization. To address this dilemma, we propose BoxPromptIML, a novel weakly-supervised IML framework that effectively balances annotation cost and localization performance. Specifically, we propose a coarse region annotation strategy, which can generate relatively accurate manipulation masks at lower cost. To improve model efficiency and facilitate deployment, we further design an efficient lightweight student model, which learns to perform fine-grained localization through knowledge distillation from a fixed teacher model based on the Segment Anything Model (SAM). Moreover, inspired by the human subconscious memory mechanism, our feature fusion module employs a dual-guidance strategy that actively contextualizes recalled prototypical patterns with real-time observational cues derived from the input. Instead of passive feature extraction, this strategy enables a dynamic process of knowledge recollection, where long-term memory is adapted to the specific context of the current image, significantly enhancing localization accuracy and robustness. Extensive experiments across both in-distribution and out-of-distribution datasets show that BoxPromptIML outperforms or rivals fully-supervised models, while maintaining strong generalization, low annotation cost, and efficient deployment characteristics.
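As a rough illustration of the two ideas named in this abstract (coarse region annotation and distillation from a fixed teacher), the sketch below rasterizes a box into a binary mask and scores a student's per-pixel probabilities against it with binary cross-entropy. This is not the authors' implementation: `box_to_mask` and `distillation_loss` are hypothetical names, and the rasterized box stands in for the SAM-based teacher output, which the abstract does not specify in detail.

```python
import math

def box_to_mask(height, width, box):
    """Rasterize a coarse box (x0, y0, x1, y1), end-exclusive, into a 0/1 mask."""
    x0, y0, x1, y1 = box
    return [[1 if (x0 <= x < x1 and y0 <= y < y1) else 0
             for x in range(width)]
            for y in range(height)]

def distillation_loss(student_probs, teacher_mask, eps=1e-7):
    """Mean per-pixel binary cross-entropy of student probabilities
    against a fixed teacher mask (assumption: the teacher is not updated)."""
    total, count = 0.0, 0
    for s_row, t_row in zip(student_probs, teacher_mask):
        for p, t in zip(s_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count

# Stand-in teacher: the box itself (a real teacher would refine it).
teacher = box_to_mask(4, 4, (1, 1, 3, 3))
# An untrained student that predicts 0.5 everywhere.
student = [[0.5] * 4 for _ in range(4)]
loss = distillation_loss(student, teacher)
```

In a training loop only the student's parameters would receive gradients from this loss; the teacher mask is treated as a constant target.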

artificial intelligence, localization, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2511.20359

Genre: Research Report > New Finding (0.46)

Industry:

Media (0.68)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

SP-VINS: A Hybrid Stereo Visual Inertial Navigation System based on Implicit Environmental Map

Du, Xueyu, Zhang, Lilian, Duan, Fuan, Luo, Xincan, Wang, Maosong, Wu, Wenqi, Mao, Jun

arXiv.org Artificial Intelligence | Nov-25-2025

Filter-based visual inertial navigation systems (VINS) have attracted mobile-robot researchers for their good balance between accuracy and efficiency, but their limited mapping quality hampers long-term high-accuracy state estimation. To this end, we first propose a novel filter-based stereo VINS which, unlike traditional simultaneous localization and mapping (SLAM) systems based on a 3D map, applies efficient loop closure constraints using an implicit environmental map composed of keyframes and 2D keypoints. Second, we propose a hybrid residual filter framework that combines landmark reprojection and ray constraints to construct a unified Jacobian matrix for measurement updates. Finally, to handle degraded environments, we incorporate the camera-IMU extrinsic parameters into the visual description to achieve online calibration. Benchmark experiments demonstrate that the proposed SP-VINS achieves high computational efficiency while maintaining long-term high-accuracy localization performance, and is superior to existing state-of-the-art (SOTA) methods.
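For readers unfamiliar with the landmark reprojection term mentioned above, here is a minimal sketch of a reprojection residual under a standard pinhole camera model. It is a generic textbook formulation, not the SP-VINS residual: the function names and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions, and the ray constraint and the Jacobian are not shown.

```python
def project(point_cam, fx, fy, cx, cy):
    """Project a 3D point given in the camera frame to pixel coordinates
    with a pinhole model (no lens distortion)."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)

def reprojection_residual(observed_px, point_cam, fx, fy, cx, cy):
    """Residual = observed pixel minus predicted pixel, as a (du, dv) pair."""
    u, v = project(point_cam, fx, fy, cx, cy)
    return (observed_px[0] - u, observed_px[1] - v)

# Hypothetical values, chosen only for illustration.
residual = reprojection_residual((101.0, 149.0), (1.0, 2.0, 2.0),
                                 fx=100.0, fy=100.0, cx=50.0, cy=50.0)
```

A filter-based estimator would linearize such a residual with respect to the state to form the measurement Jacobian used in the update step.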

artificial intelligence, environmental map, estimation, (17 more...)

arXiv.org Artificial Intelligence

2511.18756

Genre: Research Report (0.82)

Industry:

Transportation (0.68)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Meta-SimGNN: Adaptive and Robust WiFi Localization Across Dynamic Configurations and Diverse Scenarios

Xiao, Qiqi, Ye, Ziqi, He, Yinghui, Liu, Jianwei, Yu, Guanding

arXiv.org Artificial Intelligence | Nov-19-2025

To promote the practicality of deep learning-based localization, existing studies aim to address the issue of scenario dependence through meta-learning. However, these studies primarily focus on variations in environmental layouts while overlooking the impact of changes in device configurations, such as bandwidth, the number of access points (APs), and the number of antennas used. Unlike environmental changes, variations in device configurations affect the dimensionality of channel state information (CSI), thereby compromising neural network usability. To address this issue, we propose Meta-SimGNN, a novel WiFi localization system that integrates graph neural networks with meta-learning to improve localization generalization and robustness. First, we introduce a fine-grained CSI graph construction scheme, where each AP is treated as a graph node, allowing for adaptability to changes in the number of APs. To structure the features of each node, we propose an amplitude-phase fusion method and a feature extraction method. The former utilizes both amplitude and phase to construct CSI images, enhancing data reliability, while the latter extracts dimension-consistent features to address variations in bandwidth and the number of antennas. Second, a similarity-guided meta-learning strategy is developed to enhance adaptability in diverse scenarios. The initial model parameters for the fine-tuning stage are determined by comparing the similarity between the new scenario and historical scenarios, facilitating rapid adaptation of the model to the new localization scenario. Extensive experimental results over commodity WiFi devices in different scenarios show that Meta-SimGNN outperforms the baseline methods in terms of localization generalization and accuracy.
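The similarity-guided initialization described in this abstract can be pictured with the sketch below, which picks the stored parameters of the historical scenario most similar to a new one. The abstract does not say how similarity is measured, so cosine similarity over a per-scenario descriptor vector is an assumption here, as are the names `cosine_similarity` and `select_initial_params` and the shape of the `history` dictionary.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def select_initial_params(new_descriptor, history):
    """Return (scenario_name, params) of the most similar historical scenario.

    history maps a scenario name to a (descriptor, params) pair; params can be
    any object holding model weights to start fine-tuning from.
    """
    if not history:
        raise ValueError("history must contain at least one scenario")
    best = max(history,
               key=lambda name: cosine_similarity(new_descriptor, history[name][0]))
    return best, history[best][1]

# Hypothetical scenarios with toy two-dimensional descriptors.
history = {
    "office": ([1.0, 0.0], "theta_office"),
    "hall": ([0.0, 1.0], "theta_hall"),
}
name, params = select_initial_params([0.9, 0.1], history)
```

Fine-tuning on the new scenario would then start from `params` rather than from a single shared initialization.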

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.14076

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

mmWave Radar-Based Non-Line-of-Sight Pedestrian Localization at T-Junctions Utilizing Road Layout Extraction via Camera

Park, Byeonggyu, Kim, Hee-Yeun, Choi, Byonghyok, Cho, Hansang, Kim, Byungkwan, Lee, Soomok, Jeon, Mingu, Kim, Seong-Woo

arXiv.org Artificial Intelligence | Oct-15-2025

Pedestrian localization in Non-Line-of-Sight (NLoS) regions within urban environments poses a significant challenge for autonomous driving systems. While mmWave radar has demonstrated potential for detecting objects in such scenarios, the 2D radar point cloud (PCD) data is susceptible to distortions caused by multipath reflections, making accurate spatial inference difficult. Additionally, although camera images provide high-resolution visual information, they lack depth perception and cannot directly observe objects in NLoS regions. In this paper, we propose a novel framework that interprets radar PCD through the road layout inferred from camera images to localize NLoS pedestrians. The proposed method leverages visual information from the camera to interpret 2D radar PCD, enabling spatial scene reconstruction. The effectiveness of the proposed approach is validated through experiments conducted using a radar-camera system mounted on a real vehicle. The localization performance is evaluated using a dataset collected in outdoor NLoS driving environments, demonstrating the practical applicability of the method.

artificial intelligence, machine learning, spatial reasoning, (16 more...)

arXiv.org Artificial Intelligence

2508.02348

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.47)

Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization

Neural Information Processing Systems | Oct-9-2025, 02:09:15 GMT

Visual and auditory perception is crucial for observing the world. When we hear a sound, our brain will extract semantic information and locate the sounding source.

artificial intelligence, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Shandong Province (0.04)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.34)

CSIYOLO: An Intelligent CSI-based Scatter Sensing Framework for Integrated Sensing and Communication Systems

Zhang, Xudong, Tan, Jingbo, Ren, Zhizhen, Wang, Jintao, Ma, Yihua, Song, Jian

arXiv.org Artificial Intelligence | Sep-25-2025

Integrated sensing and communication (ISAC) is regarded as a promising technology for next-generation communication systems, enabling simultaneous data transmission and target sensing. Among various tasks in ISAC, scatter sensing plays a crucial role in exploiting the full potential of ISAC and supporting applications such as autonomous driving and low-altitude economy. However, most existing methods rely on either waveform and hardware modifications or traditional signal processing schemes, leading to poor compatibility with current communication systems and limited sensing accuracy. To address these challenges, we propose CSIYOLO, a framework that performs scatter localization using only estimated CSI from a single base station-user equipment pair. This framework comprises two main components: anchor-based scatter parameter detection and CSI-based scatter localization. First, by formulating scatter parameter extraction as an image detection problem, we propose an anchor-based scatter parameter detection method inspired by You Only Look Once architectures. After that, a CSI-based localization algorithm is derived to determine scatter locations with extracted parameters. Moreover, to improve localization accuracy and implementation efficiency, we design an extendable network structure with task-oriented optimizations, enabling multi-scale anchor detection and better adaptation to CSI characteristics. A noise injection training strategy is further designed to enhance robustness against channel estimation errors. Since the proposed framework operates solely on estimated CSI without modifying waveforms or signal processing pipelines, it can be seamlessly integrated into existing communication systems as a plugin. Experiments show that our proposed method can significantly outperform existing methods in scatter localization accuracy with relatively low complexity under varying numbers of scatters and estimation errors.
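One common way to realize a noise injection strategy like the one mentioned above is to add complex Gaussian noise to each training CSI sample at a chosen signal-to-noise ratio. The sketch below shows that generic augmentation; it is an assumption about what such a strategy might look like, not the scheme used in CSIYOLO, and `inject_noise` and its parameters are hypothetical.

```python
import math
import random

def inject_noise(csi, snr_db, rng=None):
    """Return a copy of a complex CSI vector with additive complex Gaussian
    noise scaled so that the average signal-to-noise ratio equals snr_db."""
    if not csi:
        raise ValueError("csi must not be empty")
    rng = rng or random.Random()
    signal_power = sum(abs(h) ** 2 for h in csi) / len(csi)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    # Split the noise power evenly between real and imaginary parts.
    sigma = math.sqrt(noise_power / 2.0)
    return [h + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for h in csi]

# Toy CSI vector; a seeded generator keeps the augmentation reproducible.
clean = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]
noisy = inject_noise(clean, snr_db=10.0, rng=random.Random(0))
```

During training, a fresh noise draw (and possibly a randomly sampled SNR) per batch would expose the network to a range of channel estimation errors.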

artificial intelligence, data mining, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2509.19335

Country:

Asia > China (0.68)
South America (0.68)
North America > United States (0.46)

Genre: Research Report (0.82)

Industry: Information Technology (0.88)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.87)

ActLoc: Learning to Localize on the Move via Active Viewpoint Selection

Li, Jiajie, Sun, Boyang, Di Giammarino, Luca, Blum, Hermann, Pollefeys, Marc

arXiv.org Artificial Intelligence | Aug-29-2025

Reliable localization is critical for robot navigation, yet most existing systems implicitly assume that all viewing directions at a location are equally informative. In practice, localization becomes unreliable when the robot observes unmapped, ambiguous, or uninformative regions. To address this, we present ActLoc, an active viewpoint-aware planning framework for enhancing localization accuracy for general robot navigation tasks. At its core, ActLoc employs a large-scale trained attention-based model for viewpoint selection. The model encodes a metric map and the camera poses used during map construction, and predicts localization accuracy across yaw and pitch directions at arbitrary 3D locations. These per-point accuracy distributions are incorporated into a path planner, enabling the robot to actively select camera orientations that maximize localization robustness while respecting task and motion constraints. ActLoc achieves state-of-the-art results on single-viewpoint selection and generalizes effectively to full-trajectory planning. Its modular design makes it readily applicable to diverse robot navigation and inspection tasks.
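The single-viewpoint selection step described above can be sketched as an argmax over a grid of predicted accuracies, restricted to orientations that a constraint allows. The grid values, the `allowed` predicate, and the function name `best_viewpoint` are assumptions made for illustration; the attention-based predictor and the path planner from the abstract are not modeled.

```python
def best_viewpoint(accuracy, yaws, pitches, allowed=lambda yaw, pitch: True):
    """Pick the (yaw, pitch) pair with the highest predicted accuracy.

    accuracy[i][j] is the predicted localization accuracy for yaws[i] and
    pitches[j]; allowed(yaw, pitch) encodes task or motion constraints.
    """
    best = None
    for i, yaw in enumerate(yaws):
        for j, pitch in enumerate(pitches):
            if not allowed(yaw, pitch):
                continue
            score = accuracy[i][j]
            if best is None or score > best[0]:
                best = (score, yaw, pitch)
    if best is None:
        raise ValueError("no orientation satisfies the constraints")
    return best[1], best[2]

# Hypothetical predictions at one 3D location (degrees for yaw and pitch).
yaws = [0, 90, 180]
pitches = [-30, 0]
accuracy = [[0.2, 0.5],
            [0.9, 0.4],
            [0.1, 0.3]]
unconstrained = best_viewpoint(accuracy, yaws, pitches)
constrained = best_viewpoint(accuracy, yaws, pitches,
                             allowed=lambda yaw, pitch: yaw != 90)
```

With these toy numbers the unconstrained choice is yaw 90, pitch -30, and excluding yaw 90 moves the choice to yaw 0, pitch 0.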

artificial intelligence, localization, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2508.20981

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
